Conjugate Directions for Stochastic Gradient Descent
نویسندگان
چکیده
The method of conjugate gradients provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore ideas from conjugate gradient in the stochastic (online) setting, using fast Hessian-gradient products to set up low-dimensional Krylov subspaces within individual mini-batches. In our benchmark experiments the resulting online learning algorithms converge orders of magnitude faster than ordinary stochastic gradient descent.
منابع مشابه
Extensions of the Hestenes-Stiefel and Polak-Ribiere-Polyak conjugate gradient methods with sufficient descent property
Using search directions of a recent class of three--term conjugate gradient methods, modified versions of the Hestenes-Stiefel and Polak-Ribiere-Polyak methods are proposed which satisfy the sufficient descent condition. The methods are shown to be globally convergent when the line search fulfills the (strong) Wolfe conditions. Numerical experiments are done on a set of CUTEr unconstrained opti...
متن کاملA new hybrid conjugate gradient algorithm for unconstrained optimization
In this paper, a new hybrid conjugate gradient algorithm is proposed for solving unconstrained optimization problems. This new method can generate sufficient descent directions unrelated to any line search. Moreover, the global convergence of the proposed method is proved under the Wolfe line search. Numerical experiments are also presented to show the efficiency of the proposed algorithm, espe...
متن کاملCombining Conjugate Direction Methods with Stochastic Approximation of Gradients
The method of conjugate directions provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore ideas from conjugate gradient in the stochastic (online) setting, using fast Hessian-gradient products to set up low-dimensional Krylov subspaces within indivi...
متن کاملAn eigenvalue study on the sufficient descent property of a modified Polak-Ribière-Polyak conjugate gradient method
Based on an eigenvalue analysis, a new proof for the sufficient descent property of the modified Polak-Ribière-Polyak conjugate gradient method proposed by Yu et al. is presented.
متن کاملConjugate gradient neural network in prediction of clay behavior and parameters sensitivities
The use of artificial neural networks has increased in many areas of engineering. In particular, this method has been applied to many geotechnical engineering problems and demonstrated some degree of success. A review of the literature reveals that it has been used successfully in modeling soil behavior, site characterization, earth retaining structures, settlement of structures, slope stabilit...
متن کامل